Kailong Wang

Understanding Safety-Sensitive Expert Behavior in Mixture-of-Experts LLMs

May 28, 2026

Zhibo Zhang, Yuxi Li, Zhen Ouyang, Ling Shi, Kailong Wang

Abstract:Mixture-of-Experts (MoE) LLMs rely on sparse, router-driven expert activation, yet how safety alignment interacts with routed expert specialization remains underexplored. A common intuition is that safety behavior may be controlled by routing harmful requests to distinct refusal-oriented experts. In this work, we provide empirical evidence for a different picture: routing patterns in aligned MoE LLMs are largely topic-driven, while safety behavior can be altered with little change to the model's intrinsic routing path. Motivated by this observation, we present **RASET** (**R**outer-**A**gnostic **S**afety-critical **E**xpert **T**uning), a red-teaming framework that probes safety enforcement that is localized in a small subset of experts while preserving the model's intrinsic routing behavior. **RASET** identifies safety-critical experts via a contrastive routing-sensitivity criterion and applies parameter-efficient tuning only to the selected experts, minimizing semantic disruption relative to router-steering interventions. These results reveal a distinct MoE safety risk, highlighting the need for expert-aware alignment mechanisms.

* 11 pages, 4 figures


Map2APS: A Physically Grounded Benchmark for Direct Angle Power Spectrum Prediction from Urban Geometry

May 14, 2026

Junxi Huang, Xiucheng Wang, Nan Cheng, Kailong Wang, Ruijin Sun, Zhisheng Yin

Abstract:Angle power spectrum (APS) characterizes the directional distribution of received signal power and is directly relevant to beam management and MIMO processing. While environment-aware learning has been widely studied for radio maps and path loss, direct map-to-APS prediction still lacks a standardized large-scale benchmark. This paper presents Map2APS, a physically grounded benchmark constructed from intelligent ray-tracing (IRT) path-level propagation records. Map2APS covers 51 equal-height urban maps and approximately 2.55 million Tx-Rx samples, with a strict cross-map split for evaluating generalization to unseen urban layouts. We benchmark representative model families and introduce MS-AReg as a strong reference baseline. On the full held-out test set of 249,993 samples, MS-AReg achieves a cosine similarity of 0.948, a peak location error of 1.20°, and an inference latency of 0.101 ms/sample. We further report dominant-direction metrics, including Top-1 dominant peak hit rate and dominant peak recall, to evaluate whether predicted spectra preserve decision-relevant arrival directions. The benchmark, code, and evaluation scripts are released at https://github.com/UNIC-Lab/aps-data.

* Submitted to IEEE GLOBECOM 2026
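
The abstract above reports cosine similarity and peak location error but does not define them precisely. The following is a minimal sketch of how two such metrics could be computed for a single predicted spectrum, assuming a 1-D azimuth spectrum on a uniform, circular grid. The function names, the 360-bin discretization, and the toy Gaussian spectra are illustrative assumptions, not the benchmark's released evaluation code.

```python
import numpy as np

def cosine_similarity(pred, true):
    """Cosine similarity between two angle power spectra (1-D arrays)."""
    pred = np.asarray(pred, dtype=float)
    true = np.asarray(true, dtype=float)
    denom = np.linalg.norm(pred) * np.linalg.norm(true)
    return float(pred @ true / denom) if denom > 0 else 0.0

def peak_location_error_deg(pred, true, bin_width_deg):
    """Circular distance in degrees between the strongest bins of two spectra.

    Assumes the bins cover a full circle, so bin 0 and the last bin are neighbors.
    """
    n = len(true)
    diff = abs(int(np.argmax(pred)) - int(np.argmax(true)))
    return min(diff, n - diff) * bin_width_deg

# Toy spectra over 360 one-degree azimuth bins (an assumed discretization).
angles = np.arange(360)
true_aps = np.exp(-0.5 * ((angles - 90) / 5.0) ** 2)  # true peak at 90 degrees
pred_aps = np.exp(-0.5 * ((angles - 92) / 5.0) ** 2)  # predicted peak at 92 degrees

sim = cosine_similarity(pred_aps, true_aps)
err = peak_location_error_deg(pred_aps, true_aps, bin_width_deg=1.0)
print(f"cosine similarity: {sim:.3f}, peak location error: {err:.1f} deg")
```

In a benchmark setting these per-sample values would typically be averaged over the test set; whether Map2APS averages in exactly this way is not stated in the abstract.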


UNSEEN: A Cross-Stack LLM Unlearning Defense against AR-LLM Social Engineering Attacks

Apr 25, 2026

Tianlong Yu, Yang Yang, Xiao Luo, Lihong Liu, Fudu Xing, Zui Tao, Kailong Wang, Gaoyang Liu, Ting Bi

Abstract:Emerging AR-LLM-based social engineering (AR-LLM-SE) attacks (e.g., SEAR) are on the verge of posing serious threats to real-world social life. In such an attack, the attacker uses AR (Augmented Reality) glasses to capture the target's image and voice, uses an LLM to identify the target and generate a social profile, and uses LLM agents to apply social engineering strategies that suggest conversation moves, winning the target's trust before phishing. Current defensive approaches, such as role-based access control or data flow tracking, are not directly applicable to the convergent AR-LLM ecosystem (considering embedded AR devices and opaque LLM inference), leaving an emerging and potent social engineering threat that existing privacy paradigms are ill-equipped to address. This necessitates a shift beyond solely human-centric measures like legislation and user education toward enforceable vendor policies and platform-level restrictions. Realizing this vision, however, faces significant technical challenges: securing resource-constrained AR-embedded devices, implementing fine-grained access control within opaque LLM inference, and governing adaptive interactive agents. To address these challenges, we present UNSEEN, a coordinated cross-stack defense that combines an AR ACL (Access Control Layer) for identity-gated sensing, F-RMU-based LLM unlearning for sensitive profile suppression, and runtime agent guardrails for adaptive interaction control. We evaluate UNSEEN in an IRB-approved user study with 60 participants and a dataset of 360 annotated conversations across realistic social scenarios.


PhySE: A Psychological Framework for Real-Time AR-LLM Social Engineering Attacks

Apr 25, 2026

Tianlong Yu, Yang Yang, Ziyi Zhou, Jiaying Xu, Siwei Li, Tong Guan, Kailong Wang, Ting Bi

Abstract:The emerging threat of AR-LLM-based Social Engineering (AR-LLM-SE) attacks (e.g., SEAR) poses a significant risk to real-world social interactions. In such an attack, a malicious actor uses Augmented Reality (AR) glasses to capture a target's visual and vocal data. A Large Language Model (LLM) then analyzes this data to identify the individual and generate a detailed social profile. Subsequently, LLM-powered agents employ social engineering strategies, providing real-time conversation suggestions, to gain the target's trust and ultimately execute phishing or other malicious acts. Despite its potential, the practical application of AR-LLM-SE faces two major bottlenecks: (1) Cold-start personalization: current Retrieval-Augmented Generation (RAG) methods introduce critical delays in the earliest turns, slowing initial profile formation and disrupting real-time interaction; (2) Static attack strategies: existing approaches rely on fixed-stage, handcrafted social engineering tactics that lack foundation in established psychological theory. To address these limitations, we propose PhySE, a novel framework with two core innovations: (1) VLM-Based SocialContext Training: to eliminate profiling delays, we efficiently pre-train a Visual Language Model (VLM) with social-context data, enabling rapid, on-the-fly profile generation; (2) Adaptive Psychological Agent: we introduce a psychological LLM that dynamically deploys distinct classes of psychological strategies based on the target's response, moving beyond static, handcrafted scripts. We evaluated PhySE through an IRB-approved user study with 60 participants, collecting a novel dataset of 360 annotated conversations across diverse social scenarios.


R2IF: Aligning Reasoning with Decisions via Composite Rewards for Interpretable LLM Function Calling

Apr 22, 2026

Aijia Cheng, Kailong Wang, Ling Shi, Yongxin Zhao

Abstract:Function calling empowers large language models (LLMs) to interface with external tools, yet existing RL-based approaches suffer from misalignment between reasoning processes and tool-call decisions. We propose R2IF, a reasoning-aware RL framework for interpretable function calling, adopting a composite reward integrating format/correctness constraints, Chain-of-Thought Effectiveness Reward (CER), and Specification-Modification-Value (SMV) reward, optimized via GRPO. Experiments on BFCL/ACEBench show R2IF outperforms baselines by up to 34.62% (Llama3.2-3B on BFCL) with positive Average CoT Effectiveness (0.05 for Llama3.2-3B), enhancing both function-calling accuracy and interpretability for reliable tool-augmented LLM deployment.
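
As a rough illustration of how a composite reward feeds into GRPO, the sketch below combines several reward terms into one scalar per rollout and then standardizes rewards within a group of rollouts sampled for the same prompt, which is the usual group-relative advantage in GRPO. The weights, the example values, and the function names are hypothetical; the abstract does not specify how R2IF computes or combines its CER and SMV terms.

```python
import numpy as np

def composite_reward(format_ok, correct, cer, smv, weights=(1.0, 1.0, 0.5, 0.5)):
    """Weighted sum of reward terms; the weights are illustrative placeholders."""
    w_fmt, w_cor, w_cer, w_smv = weights
    return w_fmt * float(format_ok) + w_cor * float(correct) + w_cer * cer + w_smv * smv

def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantages: standardize rewards within one sampled group."""
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + eps)

# Four hypothetical rollouts for the same prompt (all values are made up).
rollouts = [
    dict(format_ok=True, correct=True, cer=0.2, smv=0.1),
    dict(format_ok=True, correct=False, cer=-0.1, smv=0.0),
    dict(format_ok=False, correct=False, cer=0.0, smv=0.0),
    dict(format_ok=True, correct=True, cer=0.0, smv=0.3),
]
rewards = [composite_reward(**r) for r in rollouts]
advantages = group_relative_advantages(rewards)
print(np.round(advantages, 3))
```

Because advantages are computed relative to the group mean, a rollout is reinforced only to the extent that it scores better than its siblings, so no separate learned value function is needed.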


Verify Claimed Text-to-Image Models via Boundary-Aware Prompt Optimization

Mar 27, 2026

Zidong Zhao, Yihao Huang, Qing Guo, Tianlin Li, Anran Li, Kailong Wang, Jin Song Dong, Geguang Pu

Abstract:As Text-to-Image (T2I) generation becomes widespread, third-party platforms increasingly integrate multiple model APIs for convenient image creation. However, false claims of using official models can mislead users and harm model owners' reputations, making model verification essential to confirm whether an API's underlying model matches its claim. Existing methods address this by using verification prompts generated by official model owners, but the generation relies on multiple reference models for optimization, leading to high computational cost and sensitivity to model selection. To address this problem, we propose a reference-free T2I model verification method called Boundary-aware Prompt Optimization (BPO). It directly explores the intrinsic characteristics of the target model. The key insight is that although different T2I models produce similar outputs for normal prompts, their semantic boundaries in the embedding space (transition zones between two concepts such as "corgi" and "bagel") are distinct. Prompts near these boundaries generate unstable outputs (e.g., sometimes a corgi and sometimes a bagel) on the target model but remain stable on other models. By identifying such boundary-adjacent prompts, BPO captures model-specific behaviors that serve as reliable verification cues for distinguishing T2I models. Experiments on five T2I models and four baselines demonstrate that BPO achieves superior verification accuracy.

* Accepted to CVPR 2026 (Findings)


Low Overhead Channel Estimation in MIMO OTFS Wireless Communication Systems

Nov 11, 2025

Kailong Wang, Athina Petropulu

Abstract:Orthogonal Time Frequency Space (OTFS) modulation has recently garnered attention due to its robustness in high-mobility wireless communication environments. In OTFS, the data symbols are mapped to the Doppler-Delay (DD) domain. In this paper, we address bandwidth-efficient estimation of channel state information (CSI) for MIMO OTFS systems. Existing channel estimation techniques either require non-overlapped DD-domain pilots and associated guard regions across multiple antennas, sacrificing significant communication rate as the number of transmit antennas increases, or sophisticated algorithms to handle overlapped pilots, escalating the cost and complexity of receivers. We introduce a novel pilot-aided channel estimation method that enjoys low overhead while achieving high performance. Our approach embeds pilots within each OTFS burst in the Time-Frequency (TF) domain. We propose a novel use of TF and DD guard bins, aiming to preserve waveform orthogonality on the pilot bins and DD data integrity, respectively. The receiver first obtains low-complexity coarse estimates of the channel parameters. Leveraging the orthogonality, a virtual array (VA) is constructed. This enables the formulation of a sparse signal recovery (SSR) problem, in which the coarse estimates are used to build a low-dimensional dictionary matrix. The SSR solution yields high-resolution estimates of channel parameters. Simulation results show that the proposed approach achieves good performance with only a small number of pilots and guard bins. Furthermore, the required overhead is independent of the number of transmit antennas, ensuring good scalability of the proposed method for large MIMO arrays. The proposed approach considers practical rectangular transmit pulse-shaping and receiver matched filtering, and also accounts for fractional Doppler effects.
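
For readers unfamiliar with the two domains mentioned above, the following sketch shows one common way to move symbols between the DD grid and the TF grid: the inverse symplectic finite Fourier transform (ISFFT) and its inverse (SFFT). It assumes a grid indexed as Doppler by delay and unitary scaling; index and sign conventions vary across the OTFS literature, and this is not the paper's pilot or guard-bin design.

```python
import numpy as np

def isfft(x_dd):
    """Map a (Doppler x delay) grid to a (time x subcarrier) TF grid."""
    x_time = np.fft.ifft(x_dd, axis=0, norm="ortho")  # Doppler axis -> time axis
    return np.fft.fft(x_time, axis=1, norm="ortho")   # delay axis -> frequency axis

def sfft(x_tf):
    """Inverse of isfft: map a TF grid back to the (Doppler x delay) grid."""
    x_delay = np.fft.ifft(x_tf, axis=1, norm="ortho")  # frequency axis -> delay axis
    return np.fft.fft(x_delay, axis=0, norm="ortho")   # time axis -> Doppler axis

rng = np.random.default_rng(0)
N, M = 8, 16  # assumed grid size: N Doppler bins, M delay bins

# Random QPSK symbols placed on the DD grid.
qpsk = (rng.choice([-1, 1], size=(N, M)) + 1j * rng.choice([-1, 1], size=(N, M))) / np.sqrt(2)
x_tf = isfft(qpsk)
recovered = sfft(x_tf)

# A single DD-domain impulse spreads evenly over the whole TF grid.
impulse = np.zeros((N, M), dtype=complex)
impulse[0, 0] = 1.0
impulse_tf = isfft(impulse)
print(np.allclose(recovered, qpsk), np.abs(impulse_tf).round(4).min())
```

The impulse example shows why each DD-domain symbol experiences the full time-frequency extent of the burst, which is the usual intuition behind OTFS robustness in high-mobility channels.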


SEAR: A Multimodal Dataset for Analyzing AR-LLM-Driven Social Engineering Behaviors

May 30, 2025

Tianlong Yu, Chenghang Ye, Zheyu Yang, Ziyi Zhou, Cui Tang, Zui Tao, Jun Zhang, Kailong Wang, Liting Zhou, Yang Yang(+1 more)

Abstract:The SEAR Dataset is a novel multimodal resource designed to study the emerging threat of social engineering (SE) attacks orchestrated through augmented reality (AR) and multimodal large language models (LLMs). This dataset captures 180 annotated conversations across 60 participants in simulated adversarial scenarios, including meetings, classes and networking events. It comprises synchronized AR-captured visual/audio cues (e.g., facial expressions, vocal tones), environmental context, and curated social media profiles, alongside subjective metrics such as trust ratings and susceptibility assessments. Key findings reveal SEAR's alarming efficacy in eliciting compliance (e.g., 93.3% phishing link clicks, 85% call acceptance) and hijacking trust (76.7% post-interaction trust surge). The dataset supports research in detecting AR-driven SE attacks, designing defensive frameworks, and understanding multimodal adversarial manipulation. Rigorous ethical safeguards, including anonymization and IRB compliance, ensure responsible use. The SEAR dataset is available at https://github.com/INSLabCN/SEAR-Dataset.


Privacy Protection Against Personalized Text-to-Image Synthesis via Cross-image Consistency Constraints

Apr 17, 2025

Guanyu Wang, Kailong Wang, Yihao Huang, Mingyi Zhou, Zhang Qing cnwatcher, Geguang Pu, Li Li

Abstract:The rapid advancement of diffusion models and personalization techniques has made it possible to recreate individual portraits from just a few publicly available images. While such capabilities empower various creative applications, they also introduce serious privacy concerns, as adversaries can exploit them to generate highly realistic impersonations. To counter these threats, anti-personalization methods have been proposed, which add adversarial perturbations to published images to disrupt the training of personalization models. However, existing approaches largely overlook the intrinsic multi-image nature of personalization and instead adopt a naive strategy of applying perturbations independently, as commonly done in single-image settings. This neglects the opportunity to leverage inter-image relationships for stronger privacy protection. Therefore, we advocate for a group-level perspective on privacy protection against personalization. Specifically, we introduce Cross-image Anti-Personalization (CAP), a novel framework that enhances resistance to personalization by enforcing style consistency across perturbed images. Furthermore, we develop a dynamic ratio adjustment strategy that adaptively balances the impact of the consistency loss throughout the attack iterations. Extensive experiments on the classical CelebHQ and VGGFace2 benchmarks show that CAP substantially improves existing methods.


On the Feasibility of Using MultiModal LLMs to Execute AR Social Engineering Attacks

Apr 16, 2025

Ting Bi, Chenghang Ye, Zheyu Yang, Ziyi Zhou, Cui Tang, Jun Zhang, Zui Tao, Kailong Wang, Liting Zhou, Yang Yang(+1 more)

Abstract:Augmented Reality (AR) and Multimodal Large Language Models (LLMs) are rapidly evolving, providing unprecedented capabilities for human-computer interaction. However, their integration introduces a new attack surface for social engineering. In this paper, we present the first systematic investigation of the feasibility of orchestrating AR-driven Social Engineering attacks using Multimodal LLMs, via our proposed SEAR framework, which operates through three key phases: (1) AR-based social context synthesis, which fuses Multimodal inputs (visual, auditory and environmental cues); (2) role-based Multimodal RAG (Retrieval-Augmented Generation), which dynamically retrieves and integrates contextual data while preserving character differentiation; and (3) ReInteract social engineering agents, which execute adaptive multiphase attack strategies through inference interaction loops. To verify SEAR, we conducted an IRB-approved study with 60 participants in three experimental configurations (unassisted, AR+LLM, and full SEAR pipeline), compiling a new dataset of 180 annotated conversations in simulated social scenarios. Our results show that SEAR is highly effective at eliciting high-risk behaviors (e.g., 93.3% of participants susceptible to email phishing). The framework was particularly effective in building trust, with 85% of targets willing to accept an attacker's call after an interaction. We also identified notable limitations, such as interactions seeming "occasionally artificial" due to perceived authenticity gaps. This work provides a proof of concept for AR-LLM-driven social engineering attacks and insights for developing defensive countermeasures against next-generation augmented reality threats.
